This document is a demo (using a selected subset of a total of 377 species) of a data analysis done to support the selection of a set of indicator species for a coral reef monitoring program in the Philippines. The analysis compares statistical differences in species occurrence with the functional characteristics of those species. The results were used to make informed choices about which species to include in the monitoring program; they were not intended to answer any hard scientific questions.
The following analyses are data distribution analyses and analyses of variance at the species level. The independent variable is the site name, and the response variable is the randomised total score for each species.
The randomised total score is derived from (non-random) SCUBA swims of 50 minutes on 15 different sites, with 6 samples per site. Each swim was divided into 5 blocks of 10 minutes, and presence/absence of each species was recorded per 10-minute block. The presence/absence data was then randomised over the blocks, and points were awarded by multiplying the presence/absence per block for each species/sample with c(5,4,3,2,1) and summing the result, so a species seen in every block scores 15. This follows the Rapid Visual Census technique (Hill and Wilkinson 2004), which reasons that species found in the first block of a random swim are likely very abundant because they take little effort to find, while species found only in the last block are probably less abundant. Because local circumstances prevented the swims from being done in a random manner, the randomisation was applied post hoc.
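The scoring step can be sketched as follows. This is a minimal Python illustration, not the R code used for the report; the function name `randomised_total_score` and the example vector are hypothetical, and it assumes the shuffle is applied per species within a single sample.

```python
import random

WEIGHTS = [5, 4, 3, 2, 1]  # points for blocks 1..5, mirroring c(5,4,3,2,1)

def randomised_total_score(presence, rng):
    """Shuffle one species' presence/absence over the five blocks, then score it.

    presence: list of five 0/1 values, one per 10-minute block of one swim.
    rng: a random.Random instance, passed in so results can be reproduced.
    """
    if len(presence) != len(WEIGHTS):
        raise ValueError("expected one presence/absence value per block")
    shuffled = list(presence)
    rng.shuffle(shuffled)  # post-hoc randomisation of the block order
    return sum(p * w for p, w in zip(shuffled, WEIGHTS))

# Hypothetical sample: the species was seen in two of the five blocks.
rng = random.Random(42)
score = randomised_total_score([0, 1, 0, 0, 1], rng)
```

With two presences the score always falls between 3 (blocks 4 and 5) and 9 (blocks 1 and 2), whichever way the shuffle lands.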
Per species group, a hierarchical cluster analysis is done first, using Ward clustering and binary distances based on species presence/absence per sample. This groups species that often occur together. Using the R pvclust function (Suzuki and Shimodaira 2015), significant groups are chosen with alpha=0.9. The alpha was chosen more or less subjectively during visual inspection of the dendrogram, so that it yields logical cluster sizes. This does produce clusters that are more likely to have arisen by chance, but we consider it good enough to give a reasonable basis for selecting our indicator species, even though it doesn’t allow hard statistical conclusions.
Since the data is count data, with a score of 0 being overrepresented, the distribution of the total score per species/sample is analysed first. The data itself is clearly not normally distributed. The variance/mean ratio is calculated; data that follows a Poisson distribution should have a variance/mean ratio of about 1, and a ratio well above 1 indicates overdispersion. A plot is made of the observed frequencies of each score per species against the expected frequencies under different distributions.
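As a minimal sketch of the variance/mean ratio, the Python snippet below uses the sample variance (n - 1 denominator), which is an assumption about how the report computed it. The score vector is made up for illustration and is not survey data.

```python
from statistics import mean, variance

def variance_mean_ratio(scores):
    """Sample variance divided by the mean (index of dispersion)."""
    m = mean(scores)
    if m == 0:
        raise ValueError("ratio is undefined when the mean is zero")
    return variance(scores) / m

# Hypothetical scores for one species across ten samples, with many zeros.
scores = [0, 0, 0, 5, 9, 12, 0, 3, 15, 0]
ratio = variance_mean_ratio(scores)  # well above 1, i.e. overdispersed
```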
Different GLMs are fitted for each species, with the randomised score as the dependent variable and site name as the independent variable. The best-fitting model is selected based on its AIC (which is derived from the log-likelihood and penalises the number of parameters). Using a likelihood ratio test, this model is compared to an intercept-only model of the same type, in which the relative species abundance isn’t explained by any variable. If the outcome is significant, the model with site explains the observed data significantly better than the intercept-only model, so site has an influence on the relative species abundance. In that case a table is shown of all model coefficients (sites), with their estimated effect and whether or not each coefficient is significant (p-value < 0.05). The table is sorted by p-value. Note that one site (Andulay) does not appear in the coefficient tables; with R's default treatment contrasts this would make it the reference level, so the intercept is its estimate and each site coefficient is a difference from that site rather than from the overall mean.
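The arithmetic behind the likelihood ratio test and the AIC can be checked by hand. The Python sketch below is not the R routine that produced the report's tables: it recomputes the test from the log-likelihoods printed for the Panda Butterflyfish, and the helper names are hypothetical. The chi-squared tail probability uses a closed form that is only valid for an even number of degrees of freedom, which is the case here (8).

```python
import math

def lr_statistic(loglik_null, loglik_full):
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_null)

def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    total = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * total

def aic(loglik, n_params):
    """Akaike information criterion; lower means a better fit/complexity trade-off."""
    return 2 * n_params - 2 * loglik

# Log-likelihoods from the Panda Butterflyfish likelihood ratio table.
chisq = lr_statistic(-157.69, -149.11)  # about 17.16 (report: 17.159)
p_value = chi2_sf_even_df(17.159, 8)    # about 0.0285 (report: 0.02849)
aic_full = aic(-149.11, 10)             # AIC of the model with site
```

The small difference between 17.16 and 17.159 comes from the rounding of the printed log-likelihoods.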
In the case that a zero-inflated model has the best fit, there are actually two sets of coefficients: the zero part tests whether sites differ significantly in the chance of a zero count for that species (so a higher value for a site means the species is less likely to be encountered there), and the count part tests whether sites differ in relative abundance.
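To illustrate how the two parts combine, the Python sketch below evaluates a zero-inflated Poisson probability mass function. It assumes a logit link for the zero part, which is a common default but is not stated in the report. The parameter values and function names are hypothetical.

```python
import math

def inv_logit(eta):
    """Map a zero-part linear predictor (logit scale) to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with excess-zero probability pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: excess zeros and the Poisson itself.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Hypothetical site: zero-part linear predictor of 0, so pi = 0.5.
pi_site = inv_logit(0.0)
p_zero = zip_pmf(0, lam=1.0, pi=pi_site)
total = sum(zip_pmf(k, lam=1.0, pi=pi_site) for k in range(50))  # close to 1
```

A larger zero-part coefficient raises `pi`, and with it the probability of a zero count, which is why a higher value in the zero part means the species is less likely to be encountered.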
If the likelihood ratio test shows that the best-fitting model is not significant, nothing more is done for the GLM. If it is significant, the table of model coefficients is shown, together with two graphs for evaluating the model assumptions. If the mean (continuous line) in the plot of fitted values vs residuals is (mostly) 0, and the normal Q-Q plot shows the residuals to be approximately normally distributed, the model assumptions are considered reasonable.
As a last test at the species level, a non-parametric Kruskal-Wallis test is done to check the same effect of site name on species abundance without distributional assumptions. If the Kruskal-Wallis test is significant, it is followed by a pairwise Dunn post-hoc test with Bonferroni adjustment to show which sites differ.
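For readers who want to see what the Kruskal-Wallis statistic measures, here is a minimal Python sketch of H with midranks and the usual tie correction, plus the Bonferroni adjustment. It is an illustration with hypothetical function names and toy data, not the R code behind the tables in this report, and it does not implement the Dunn test itself.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic using midranks and the standard tie correction."""
    values = sorted(v for g in groups for v in g)
    n = len(values)
    rank_of = {}   # midrank for each distinct value
    tie_term = 0   # sum of t^3 - t over tie groups of size t
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        t = j - i
        rank_of[values[i]] = (i + 1 + j) / 2.0  # mean of 1-based positions i+1..j
        tie_term += t ** 3 - t
        i = j
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3.0 * (n + 1)
    return h / correction

def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)

# Toy data: three fully separated groups of three observations each.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
n_pairs = 9 * 8 // 2  # pairwise comparisons among the 9 sites in the tables
```

H is compared to a chi-squared distribution with (number of groups - 1) degrees of freedom, which matches the df = 8 printed in the test output for 9 sites.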
Finally, we analysed the Shannon-Wiener species diversity for all fish species, per family, and for the selected indicator species, to show how well the selected indicator species represent the overall species diversity.
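The Shannon-Wiener index is H' = -Σ p_i ln p_i, where p_i is the proportion of the total abundance contributed by species i. The Python sketch below assumes the natural logarithm and uses made-up abundances. Which abundance measure the report used (for example the randomised scores) is not stated, so treat the input as hypothetical.

```python
import math

def shannon_wiener(abundances):
    """H' = -sum(p_i * ln p_i) over species with non-zero abundance."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("need at least one non-zero abundance")
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)

# Hypothetical community of four equally abundant species: H' equals ln(4).
h_even = shannon_wiener([5, 5, 5, 5])
# A community dominated by one species has a much lower diversity.
h_skewed = shannon_wiener([17, 1, 1, 1])
```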
Hierarchical cluster analysis investigating species grouping across samples based on their presence/absence. Dissimilarities are calculated using the Jaccard index: the dissimilarity between two species i and j is the number of samples that contain both species, divided by the number of samples that contain at least one of them, subtracted from 1. Based on these dissimilarities, an average-linkage hierarchical clustering is done to group the species.
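The Jaccard dissimilarity between two species can be sketched in a few lines of Python. The presence/absence vectors below are made up, and returning 0 for two species that are absent from every sample is a convention chosen for this sketch.

```python
def jaccard_dissimilarity(a, b):
    """1 - (samples with both species) / (samples with at least one of them)."""
    if len(a) != len(b):
        raise ValueError("both species need one value per sample")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        return 0.0  # convention for this sketch: two all-absent species
    return 1.0 - both / either

# Hypothetical presence/absence of two species over five samples.
species_i = [1, 1, 0, 0, 1]
species_j = [1, 0, 1, 0, 1]
d = jaccard_dissimilarity(species_i, species_j)  # 2 shared out of 4 occupied
```

Samples in which both species are absent do not count towards the denominator, so rare species are not made to look similar just because they are missing from the same samples.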
Clustering diagram
Table of the species clusters
| group | species |
|---|---|
| 1 | pygoplites_diacanthus_pres |
| 1 | centropyge_vroliki_pres |
| 1 | centropyge_bicolor_pres |
| 1 | centropyge_tibicen_pres |
| 1 | chaetodon_adiergastos_pres |
| 1 | chaetodon_kleinii_pres |
| 1 | chaetodon_triangulum_pres |
| 1 | heniochus_varius_pres |
| 1 | chaetodon_vagabundus_pres |
| 2 | chaetodon_selene_pres |
| 2 | chaetodon_cittrinellus_pres |
| 3 | chaetodon_ulietensis_pres |
| 3 | chaetodon_bennetti_pres |
| 3 | chaetodon_lineolatus_pres |
| 4 | heniochus_diphreutes_pres |
| 4 | centropyge_bispinosus_pres |
| 5 | chaetodontoplus_mesoleucus_pres |
| 5 | pomacanthus_navarchus_pres |
| 5 | chaetodon_rafflesi_pres |
| 5 | coradion_melanopus_pres |
| 5 | chaetodon_melannotus_pres |
| 5 | chaetodon_oxycephalus_pres |
| 5 | chaetodon_ocellicaudus_pres |
| 5 | chaetodon_speculum_pres |
Bicolor Angelfish
Variance/Mean Ratio: 2.1
Observed frequencies of the total score vs expected frequencies of different distributions.
## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"
Generalized linear model fit and parameters using Poisson model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -186.78
## 2 9 -127.86 8 117.85 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 2.6508918 | 0.000 |
| 4 | data$site_nameDauin Poblacion District 1 | -3.0563569 | 0.000 |
| 3 | data$site_nameBasak | -0.3996000 | 0.020 |
| 7 | data$site_nameLutoban Pier | -0.3651138 | 0.031 |
| 9 | data$site_nameMalatapay Pier | -0.3155169 | 0.059 |
| 5 | data$site_nameGuinsuan | -0.2231436 | 0.170 |
| 6 | data$site_nameKookoo’s Nest | -0.2231436 | 0.170 |
| 8 | data$site_nameLutoban South | -0.0359320 | 0.816 |
| 2 | data$site_nameAntulang | 0.0232569 | 0.879 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 29.587, df = 8, p-value = 0.0002501
Dunn (with Bonferroni p-value adjustment) post-hoc test results after Kruskal-Wallis test, showing only significant (p-adj<0.05) results.
| site | difference | p_adj |
|---|---|---|
| Dauin Poblacion District 1-Andulay | 4 | 0.001 |
| Dauin Poblacion District 1-Antulang | 4 | 0.000 |
| Lutoban South-Dauin Poblacion District 1 | 4 | 0.003 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -11.5862
## 2 9 -4.1589 8 14.855 0.06203 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Keyhole Angelfish
Variance/Mean Ratio: 3.8
Observed frequencies of the total score vs expected frequencies of different distributions.
Generalized linear model fit and parameters using Poisson model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -214.97
## 2 9 -116.51 8 196.92 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 2 | data$site_nameAntulang | 1.5040774 | 0.000 |
| 3 | data$site_nameBasak | 2.0476928 | 0.000 |
| 5 | data$site_nameGuinsuan | 2.3864666 | 0.000 |
| 7 | data$site_nameLutoban Pier | 1.7707061 | 0.000 |
| 8 | data$site_nameLutoban South | 2.2380466 | 0.000 |
| 9 | data$site_nameMalatapay Pier | 2.1400662 | 0.000 |
| 6 | data$site_nameKookoo’s Nest | 0.9162907 | 0.028 |
| 4 | data$site_nameDauin Poblacion District 1 | -0.9808293 | 0.147 |
| 1 | (Intercept) | 0.2876821 | 0.416 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 44.031, df = 8, p-value = 5.613e-07
Dunn (with Bonferroni p-value adjustment) post-hoc test results after Kruskal-Wallis test, showing only significant (p-adj<0.05) results.
| site | difference | p_adj |
|---|---|---|
| Dauin Poblacion District 1-Basak | 3 | 0.049 |
| Guinsuan-Andulay | 4 | 0.000 |
| Guinsuan-Dauin Poblacion District 1 | 5 | 0.000 |
| Kookoo’s Nest-Guinsuan | 4 | 0.010 |
| Lutoban South-Andulay | 4 | 0.008 |
| Lutoban South-Dauin Poblacion District 1 | 4 | 0.002 |
| Malatapay Pier-Andulay | 3 | 0.042 |
| Malatapay Pier-Dauin Poblacion District 1 | 4 | 0.013 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -22.6521
## 2 9 -6.8623 8 31.58 0.0001107 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-values for model parameters
| coefficient | Pr(>\|z\|) |
|---|---|
| (Intercept) | 1.000 |
| Antulang | 0.999 |
| Basak | 0.999 |
| Dauin Poblacion District 1 | 0.239 |
| Guinsuan | 0.999 |
| Kookoo’s Nest | 0.999 |
| Lutoban Pier | 0.999 |
| Lutoban South | 0.999 |
| Malatapay Pier | 0.999 |
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.
Normal Q-Q plot for the model residuals.
Panda Butterflyfish
Variance/Mean Ratio: 2.3
Observed frequencies of the total score vs expected frequencies of different distributions.
Generalized linear model fit and parameters using Gaussian model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -157.69
## 2 10 -149.11 8 17.159 0.02849 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 9.6666667 | 0.000 |
| 4 | data$site_nameDauin Poblacion District 1 | -5.1666667 | 0.033 |
| 2 | data$site_nameAntulang | 3.3333333 | 0.169 |
| 5 | data$site_nameGuinsuan | -2.6666667 | 0.271 |
| 6 | data$site_nameKookoo’s Nest | -2.5000000 | 0.302 |
| 7 | data$site_nameLutoban Pier | -1.8333333 | 0.449 |
| 3 | data$site_nameBasak | 1.3333333 | 0.582 |
| 8 | data$site_nameLutoban South | -0.3333333 | 0.890 |
| 9 | data$site_nameMalatapay Pier | -0.1666667 | 0.945 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 15.015, df = 8, p-value = 0.05885
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -8.5542
## 2 9 -5.4067 8 6.2949 0.6142
Spot-Banded Butterflyfish
Variance/Mean Ratio: 6.1
Observed frequencies of the total score vs expected frequencies of different distributions.
## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"
Generalized linear model fit and parameters using Poisson model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -269.15
## 2 9 -116.10 8 306.11 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 1.1526795 | 0.000 |
| 4 | data$site_nameDauin Poblacion District 1 | 1.3993664 | 0.000 |
| 6 | data$site_nameKookoo’s Nest | 1.5553707 | 0.000 |
| 8 | data$site_nameLutoban South | 1.4375877 | 0.000 |
| 3 | data$site_nameBasak | -1.8458267 | 0.003 |
| 2 | data$site_nameAntulang | -1.1526795 | 0.014 |
| 7 | data$site_nameLutoban Pier | 0.5520686 | 0.055 |
| 9 | data$site_nameMalatapay Pier | 0.3136576 | 0.299 |
| 5 | data$site_nameGuinsuan | -18.4552646 | 0.990 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 43.231, df = 8, p-value = 7.948e-07
Dunn (with Bonferroni p-value adjustment) post-hoc test results after Kruskal-Wallis test, showing only significant (p-adj<0.05) results.
| site | difference | p_adj |
|---|---|---|
| Dauin Poblacion District 1-Basak | 3 | 0.027 |
| Guinsuan-Dauin Poblacion District 1 | 4 | 0.010 |
| Kookoo’s Nest-Andulay | 3 | 0.049 |
| Kookoo’s Nest-Antulang | 4 | 0.005 |
| Kookoo’s Nest-Basak | 4 | 0.002 |
| Kookoo’s Nest-Guinsuan | 4 | 0.001 |
| Lutoban South-Antulang | 3 | 0.029 |
| Lutoban South-Basak | 4 | 0.010 |
| Lutoban South-Guinsuan | 4 | 0.004 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -35.594
## 2 9 -16.088 8 39.012 4.89e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-values for model parameters
| coefficient | Pr(>\|z\|) |
|---|---|
| (Intercept) | 1.000 |
| Antulang | 0.560 |
| Basak | 0.239 |
| Dauin Poblacion District 1 | 0.996 |
| Guinsuan | 0.996 |
| Kookoo’s Nest | 0.996 |
| Lutoban Pier | 0.239 |
| Lutoban South | 0.996 |
| Malatapay Pier | 0.239 |
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.
Normal Q-Q plot for the model residuals.
Triangular Butterflyfish
Variance/Mean Ratio: 3.4
Observed frequencies of the total score vs expected frequencies of different distributions.
Generalized linear model fit and parameters using Negative Binomial model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -159.06
## 2 10 -135.94 8 46.241 2.139e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 1.6422277 | 0.000 |
| 6 | data$site_nameKookoo’s Nest | 0.9227216 | 0.000 |
| 9 | data$site_nameMalatapay Pier | -1.2367626 | 0.003 |
| 4 | data$site_nameDauin Poblacion District 1 | 0.7556675 | 0.005 |
| 2 | data$site_nameAntulang | -0.9490806 | 0.011 |
| 5 | data$site_nameGuinsuan | 0.6768867 | 0.012 |
| 3 | data$site_nameBasak | 0.2548922 | 0.373 |
| 8 | data$site_nameLutoban South | 0.2548922 | 0.373 |
| 7 | data$site_nameLutoban Pier | 0.1495317 | 0.607 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 31.583, df = 8, p-value = 0.0001106
Dunn (with Bonferroni p-value adjustment) post-hoc test results after Kruskal-Wallis test, showing only significant (p-adj<0.05) results.
| site | difference | p_adj |
|---|---|---|
| Dauin Poblacion District 1-Antulang | 3 | 0.028 |
| Kookoo’s Nest-Antulang | 4 | 0.003 |
| Malatapay Pier-Dauin Poblacion District 1 | 4 | 0.017 |
| Malatapay Pier-Kookoo’s Nest | 4 | 0.002 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -18.837
## 2 9 -10.681 8 16.311 0.03814 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-values for model parameters
| coefficient | Pr(>\|z\|) |
|---|---|
| (Intercept) | 0.998 |
| Antulang | 0.998 |
| Basak | 0.998 |
| Dauin Poblacion District 1 | 1.000 |
| Guinsuan | 1.000 |
| Kookoo’s Nest | 1.000 |
| Lutoban Pier | 1.000 |
| Lutoban South | 1.000 |
| Malatapay Pier | 0.998 |
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.
Normal Q-Q plot for the model residuals.
Hierarchical cluster analysis investigating species grouping across samples based on their presence/absence. Dissimilarities are calculated using the Jaccard index: the dissimilarity between two species i and j is the number of samples that contain both species, divided by the number of samples that contain at least one of them, subtracted from 1. Based on these dissimilarities, an average-linkage hierarchical clustering is done to group the species.
Clustering diagram
Table of the species clusters
| group | species |
|---|---|
| 1 | unknown_pres |
| 1 | unknown_pres.1 |
| 2 | epinephelus_fasciatus_pres |
| 2 | lutjanus_decussatus_pres |
| 2 | lutjanus_fulvus_pres |
| 2 | cephalopholis_argus_pres |
| 2 | scarus_niger_pres |
| 2 | macolor_macularis_juv_pres |
| 3 | plectorhinchus_polytaenia_pres |
| 3 | lutjanus_monostigma_pres |
| 4 | chlorurus_sordidus_ip_pres |
| 4 | scarus_dimidiatus_ip_pres |
| 5 | siganus_virgatus_pres |
| 5 | chlorurus_microrhinos_pres |
| 6 | plectorhinchus_vittatus_pres |
| 6 | plectorhinchus_lineatus_pres |
| 7 | diplorion_bifasciatum_pres |
| 7 | cephalopholis_cyanostigma_pres |
| 8 | siganus_punctatissimus_pres |
| 8 | scarus_hypselopterus_pres |
| 9 | epinephelus_ongus_juv_pres |
| 9 | variola_louti_juv_pres |
| 10 | plectorhinchus_lessonii_pres |
| 10 | plectorhinchus_chaetodonoides_pres |
| 11 | gracila_albomarginata_juv_pres |
| 11 | gracila_albomarginata_pres |
| 12 | scarus_schlegeli_ip_pres |
| 12 | scarus_psittacus_pres |
| 13 | lutjanus_quinquelineatus_pres |
| 13 | lutjanus_vitta_pres |
| 14 | cephalopholis_miniata_pres |
| 14 | lutjanus_rivulatus_pres |
| 15 | scarus_ghobban_ip_pres |
| 15 | scarus_forsteni_ip_pres |
| 16 | lutjanus_biguttatus_pres |
| 16 | chlorurus_bleekeri_ip_pres |
| 16 | scarus_flavipectoralis_ip_pres |
| 16 | cephalopholis_microprion_pres |
| 16 | variola_louti_pres |
| 17 | cetoscarus_ocellatus_ip_pres |
| 17 | variola_albimarginata_juv_pres |
| 18 | diagramma_pictum_juv_pres |
| 18 | diagramma_pictum_sub-adult_pres |
| 18 | plectorhinchus_vittatus_juv_pres |
Peacock Grouper
Variance/Mean Ratio: 2.9
Observed frequencies of the total score vs expected frequencies of different distributions.
Generalized linear model fit and parameters using Gaussian model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -138.74
## 2 10 -111.13 8 55.206 4.027e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 14.8 | 0.000 |
| 5 | data$site_nameGuinsuan | -14.0 | 0.000 |
| 7 | data$site_nameLutoban Pier | -8.8 | 0.000 |
| 9 | data$site_nameMalatapay Pier | -7.2 | 0.000 |
| 8 | data$site_nameLutoban South | -6.8 | 0.001 |
| 3 | data$site_nameBasak | -3.4 | 0.093 |
| 4 | data$site_nameDauin Poblacion District 1 | -1.4 | 0.489 |
| 2 | data$site_nameAntulang | -1.0 | 0.621 |
| 6 | data$site_nameKookoo’s Nest | -0.8 | 0.692 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 30.668, df = 8, p-value = 0.0001609
Dunn (with Bonferroni p-value adjustment) post-hoc test results after Kruskal-Wallis test, showing only significant (p-adj<0.05) results.
| site | difference | p_adj |
|---|---|---|
| Guinsuan-Andulay | 4 | 0.003 |
| Guinsuan-Antulang | 3 | 0.035 |
| Kookoo’s Nest-Guinsuan | 4 | 0.008 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -15.6974
## 2 9 -8.3691 8 14.657 0.06617 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Blacktip Grouper
Variance/Mean Ratio: 1.5
Observed frequencies of the total score vs expected frequencies of different distributions.
## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"
Generalized linear model fit and parameters using Gaussian model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -126.17
## 2 10 -108.71 8 34.922 2.763e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 15.0 | 0.000 |
| 4 | data$site_nameDauin Poblacion District 1 | -7.4 | 0.000 |
| 6 | data$site_nameKookoo’s Nest | -8.6 | 0.000 |
| 7 | data$site_nameLutoban Pier | -7.0 | 0.000 |
| 9 | data$site_nameMalatapay Pier | -4.2 | 0.028 |
| 3 | data$site_nameBasak | -2.6 | 0.175 |
| 8 | data$site_nameLutoban South | -2.2 | 0.251 |
| 5 | data$site_nameGuinsuan | -1.6 | 0.404 |
| 2 | data$site_nameAntulang | -1.0 | 0.602 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 25.076, df = 8, p-value = 0.001509
Dunn (with Bonferroni p-value adjustment) post-hoc test results after Kruskal-Wallis test, showing only significant (p-adj<0.05) results.
| site | difference | p_adj |
|---|---|---|
| Dauin Poblacion District 1-Andulay | 3 | 0.039 |
| Kookoo’s Nest-Andulay | 3 | 0.019 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -1.3053e-10
## 2 9 -1.3053e-10 8 0 1
Golden Rabbitfish
Variance/Mean Ratio: 8.4
Observed frequencies of the total score vs expected frequencies of different distributions.
## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"
Generalized linear model fit and parameters using Poisson model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -196.509
## 2 9 -42.779 8 307.46 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 3 | data$site_nameBasak | 22.6821314 | 0.997 |
| 4 | data$site_nameDauin Poblacion District 1 | 22.8040212 | 0.997 |
| 1 | (Intercept) | -20.3025853 | 0.998 |
| 2 | data$site_nameAntulang | 19.3862945 | 0.998 |
| 6 | data$site_nameKookoo’s Nest | 21.4657361 | 0.998 |
| 5 | data$site_nameGuinsuan | 0.0000002 | 1.000 |
| 7 | data$site_nameLutoban Pier | 0.0000002 | 1.000 |
| 8 | data$site_nameLutoban South | 0.0000002 | 1.000 |
| 9 | data$site_nameMalatapay Pier | 0.0000002 | 1.000 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 39.008, df = 8, p-value = 4.898e-06
Dunn (with Bonferroni p-value adjustment) post-hoc test results after Kruskal-Wallis test, showing only significant (p-adj<0.05) results.
| site | difference | p_adj |
|---|---|---|
| Basak-Andulay | 3 | 0.019 |
| Dauin Poblacion District 1-Andulay | 4 | 0.009 |
| Guinsuan-Basak | 3 | 0.019 |
| Guinsuan-Dauin Poblacion District 1 | 4 | 0.009 |
| Lutoban Pier-Basak | 3 | 0.019 |
| Lutoban Pier-Dauin Poblacion District 1 | 4 | 0.009 |
| Lutoban South-Basak | 3 | 0.019 |
| Lutoban South-Dauin Poblacion District 1 | 4 | 0.009 |
| Malatapay Pier-Basak | 3 | 0.019 |
| Malatapay Pier-Dauin Poblacion District 1 | 4 | 0.009 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -28.643
## 2 9 -5.004 8 47.278 1.357e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-values for model parameters
| coefficient | Pr(>\|z\|) |
|---|---|
| (Intercept) | 0.999 |
| Antulang | 0.999 |
| Basak | 0.998 |
| Dauin Poblacion District 1 | 0.998 |
| Guinsuan | 1.000 |
| Kookoo’s Nest | 0.999 |
| Lutoban Pier | 1.000 |
| Lutoban South | 1.000 |
| Malatapay Pier | 1.000 |
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.
Normal Q-Q plot for the model residuals.
Hierarchical cluster analysis investigating species grouping across samples based on their presence/absence. Dissimilarities are calculated using the Jaccard index: the dissimilarity between two species i and j is the number of samples that contain both species, divided by the number of samples that contain at least one of them, subtracted from 1. Based on these dissimilarities, an average-linkage hierarchical clustering is done to group the species.
Clustering diagram
Table of the species clusters
| group | species |
|---|---|
| 1 | lethrinus_erythracanthus_pres |
| 1 | acanthurus_pyroferus_juv_pres |
| 2 | acanthurus_nigrofuscus_pres |
| 2 | acanthurus_mata_pres |
| 2 | ctenochaetus_binotatus_pres |
| 2 | naso_minor_pres |
| 2 | zebrasoma_scopas_pres |
| 2 | balistapus_undulatus_pres |
| 2 | balistoides_viridescens_pres |
| 2 | sufflamen_bursa_pres |
| 2 | pterocaesio_pisang_pres |
| 2 | scolopsis_bilineata_pres |
| 2 | arothron_nigropuncatus_pres |
| 2 | canthigaster_valentini_pres |
| 2 | canthigaster_papua_pres |
| 2 | acanthurus_pyroferus_pres |
| 3 | scolopsis_affinis_pres |
| 3 | sufflamen_chrysopterus_pres |
| 3 | scolopsis_affinis_juv_pres |
| 4 | ctenochaetus_tominiensis_pres |
| 4 | ostracion_solorensis_pres |
| 4 | acanthurus_thompsoni_pres |
| 4 | ctenochaetus_cyanocheilus_pres |
| 5 | monotaxis_heterodon_pres |
| 5 | lethrinus_obsoletus_pres |
| 6 | diodon_holocanthus_pres |
| 6 | naso_thynoides_pres |
| 7 | lethrinus_erythracanthus_juv_pres |
| 7 | lethrinus_erythropterus_pres |
| 8 | naso_hexacanthus_pres |
| 8 | gymnocranius_microdon_pres |
| 9 | pseudaluttarius_nasicornis_pres |
| 9 | arothron_manilensis_pres |
| 10 | zebrasoma_flavescens_pres |
| 10 | caesio_lunaris_pres |
| 10 | ctenochaetus_binotatus_juv_pres |
| 10 | paracanthurus_hepatus_pres |
Blackspotted Puffer
Variance/Mean Ratio: 3.2
Observed frequencies of the total score vs expected frequencies of different distributions.
Generalized linear model fit and parameters using Gaussian model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -159.90
## 2 10 -146.81 8 26.179 0.0009786 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 6.3333333 | 0.000 |
| 4 | data$site_nameDauin Poblacion District 1 | 6.0000000 | 0.010 |
| 6 | data$site_nameKookoo’s Nest | 4.8333333 | 0.037 |
| 9 | data$site_nameMalatapay Pier | -3.1666667 | 0.172 |
| 8 | data$site_nameLutoban South | -2.6666667 | 0.250 |
| 7 | data$site_nameLutoban Pier | 1.1666667 | 0.615 |
| 5 | data$site_nameGuinsuan | 1.0000000 | 0.666 |
| 2 | data$site_nameAntulang | -0.6666667 | 0.774 |
| 3 | data$site_nameBasak | -0.5000000 | 0.829 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 18.613, df = 8, p-value = 0.01707
Dunn (with Bonferroni p-value adjustment) post-hoc test results after Kruskal-Wallis test, showing only significant (p-adj<0.05) results.
| site | difference | p_adj |
|---|---|---|
| Malatapay Pier-Dauin Poblacion District 1 | 3 | 0.045 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -16.659
## 2 9 -10.341 8 12.634 0.1251
Titan Triggerfish
Variance/Mean Ratio: 2.9
Observed frequencies of the total score vs expected frequencies of different distributions.
Generalized linear model fit and parameters using Zero-Inflated Poisson model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -143.62
## 2 18 -117.73 16 51.766 1.195e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | group | coefficient | value | pvalue |
|---|---|---|---|---|
| 1 | count | (Intercept) | 1.8971200 | 0.000 |
| 8 | count | data$site_nameLutoban South | -1.4311096 | 0.006 |
| 3 | count | data$site_nameBasak | 0.4219944 | 0.045 |
| 7 | count | data$site_nameLutoban Pier | -0.4286010 | 0.122 |
| 5 | count | data$site_nameGuinsuan | -0.4048218 | 0.170 |
| 2 | count | data$site_nameAntulang | -0.3215836 | 0.196 |
| 4 | count | data$site_nameDauin Poblacion District 1 | 0.1392919 | 0.598 |
| 9 | count | data$site_nameMalatapay Pier | -0.0746284 | 0.758 |
| 6 | count | data$site_nameKookoo’s Nest | -0.0512934 | 0.822 |
| 10 | zero | (Intercept) | -19.7445376 | 0.998 |
| 13 | zero | data$site_nameDauin Poblacion District 1 | 19.7435969 | 0.998 |
| 14 | zero | data$site_nameGuinsuan | 19.0156279 | 0.998 |
| 16 | zero | data$site_nameLutoban Pier | 18.0538928 | 0.998 |
| 17 | zero | data$site_nameLutoban South | 18.1108992 | 0.998 |
| 18 | zero | data$site_nameMalatapay Pier | 18.1226874 | 0.998 |
| 11 | zero | data$site_nameAntulang | -0.0000002 | 1.000 |
| 12 | zero | data$site_nameBasak | -0.0000002 | 1.000 |
| 15 | zero | data$site_nameKookoo’s Nest | -0.0000002 | 1.000 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, making no assumptions on the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 19.788, df = 8, p-value = 0.01117
Dunn (with Bonferroni p-value adjustment) post-hoc test results after Kruskal-Wallis test, showing only significant (p-adj<0.05) results.
| site | difference | p_adj |
|---|---|---|
| Lutoban South-Basak | 4 | 0.007 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -24.330
## 2 9 -17.204 8 14.253 0.0754 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Ornate Emperor
Variance/Mean Ratio: 5.9
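The variance/mean ratio (index of dispersion) compares the sample variance of the total scores to their mean. A Poisson variable has a ratio near 1, so values well above 1 point to overdispersion. A minimal base R sketch with made-up scores:

```r
# Hypothetical total scores for one species across ten samples
y <- c(0, 0, 3, 15, 12, 0, 5, 9, 0, 14)

# Index of dispersion: sample variance divided by sample mean
vmr <- var(y) / mean(y)
print(round(vmr, 1))
```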
Observed frequencies of the total score vs expected frequencies of different distributions.
Generalized Linear Model fit and parameters using a Zero-Inflated Poisson model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -149.75
## 2 18 -100.22 16 99.054 5.208e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | group | coefficient | value | pvalue |
|---|---|---|---|---|
| 1 | count | (Intercept) | 2.6625879 | 0.000 |
| 5 | count | data$site_nameGuinsuan | -1.5298305 | 0.000 |
| 7 | count | data$site_nameLutoban Pier | -1.2963203 | 0.000 |
| 3 | count | data$site_nameBasak | -0.6931472 | 0.001 |
| 2 | count | data$site_nameAntulang | -0.5031037 | 0.007 |
| 8 | count | data$site_nameLutoban South | -1.6253411 | 0.016 |
| 9 | count | data$site_nameMalatapay Pier | -1.2963185 | 0.020 |
| 4 | count | data$site_nameDauin Poblacion District 1 | -1.0601512 | 0.030 |
| 6 | count | data$site_nameKookoo’s Nest | -0.2954643 | 0.082 |
| 10 | zero | (Intercept) | -19.6146680 | 0.998 |
| 13 | zero | data$site_nameDauin Poblacion District 1 | 21.2156999 | 0.998 |
| 14 | zero | data$site_nameGuinsuan | 18.7769782 | 0.998 |
| 16 | zero | data$site_nameLutoban Pier | 19.5742068 | 0.998 |
| 17 | zero | data$site_nameLutoban South | 21.1499991 | 0.998 |
| 18 | zero | data$site_nameMalatapay Pier | 21.2000263 | 0.998 |
| 11 | zero | data$site_nameAntulang | -0.0000001 | 1.000 |
| 12 | zero | data$site_nameBasak | -0.0000001 | 1.000 |
| 15 | zero | data$site_nameKookoo’s Nest | -0.0000001 | 1.000 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, which makes no assumptions about the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 39.83, df = 8, p-value = 3.447e-06
Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj < 0.05) results.
| site | difference | p_adj |
|---|---|---|
| Dauin Poblacion District 1-Andulay | 4 | 0.002 |
| Guinsuan-Andulay | 3 | 0.050 |
| Lutoban Pier-Andulay | 3 | 0.025 |
| Lutoban South-Andulay | 4 | 0.001 |
| Lutoban South-Kookoo’s Nest | 3 | 0.032 |
| Malatapay Pier-Andulay | 4 | 0.001 |
| Malatapay Pier-Kookoo’s Nest | 3 | 0.042 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -35.594
## 2 9 -16.088 8 39.012 4.89e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-values for model parameters
| | Pr(>\|z\|) |
|---|---|
| (Intercept) | 0.996 |
| Antulang | 1.000 |
| Basak | 1.000 |
| Dauin Poblacion District 1 | 0.996 |
| Guinsuan | 0.997 |
| Kookoo’s Nest | 1.000 |
| Lutoban Pier | 0.996 |
| Lutoban South | 0.996 |
| Malatapay Pier | 0.996 |
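The pattern above, a highly significant likelihood ratio test next to parameter p-values close to 1, is what (quasi-)complete separation typically produces: when a species is present in every sample at some sites and absent at others, the logit estimates drift toward infinity and the Wald standard errors become very large. The following base R sketch reproduces the effect on made-up data. The site labels and counts are hypothetical.

```r
# Hypothetical presence/absence: always present at A, never at B, mixed at C
toy <- data.frame(
  y = c(rep(1, 6), rep(0, 6), c(1, 0, 1, 0, 1, 0)),
  site_name = factor(rep(c("A", "B", "C"), each = 6))
)

# glm() warns about fitted probabilities of 0 or 1 here; the warning is suppressed
null_fit <- glm(y ~ 1, family = binomial, data = toy)
full_fit <- suppressWarnings(glm(y ~ site_name, family = binomial, data = toy))

# Likelihood ratio test computed from the two log-likelihoods
lr_stat <- 2 * (as.numeric(logLik(full_fit)) - as.numeric(logLik(null_fit)))
lr_df   <- attr(logLik(full_fit), "df") - attr(logLik(null_fit), "df")
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Wald p-values for the individual parameters
wald_p <- coef(summary(full_fit))[, "Pr(>|z|)"]

print(lr_p)
print(round(wald_p, 3))
```

In such cases the likelihood ratio test is the more useful summary of a site effect, and the per-parameter p-values should not be read as evidence of no difference.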
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.
Normal Q-Q plot for the model residuals.
Bluestreak Fusilier
Variance/Mean Ratio: 4.8
Observed frequencies of the total score vs expected frequencies of different distributions.
## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"
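Messages like the two above suggest that candidate models which fail to fit are skipped rather than stopping the report. How the original script does this is not shown. One common pattern, sketched here with a hypothetical helper `fit_or_skip`, wraps each fit in `tryCatch`:

```r
# Hypothetical helper: try a model fit, report and return NULL on error
fit_or_skip <- function(label, fit_fun) {
  tryCatch(
    fit_fun(),
    error = function(e) {
      print(paste("Skipping", label, "model due to errors"))
      NULL
    }
  )
}

# Made-up data and two candidate fits; the second one fails on purpose
toy <- data.frame(y = c(0, 2, 5, 1, 0, 7), x = gl(2, 3))
fits <- list(
  poisson = fit_or_skip("Poisson", function() glm(y ~ x, family = poisson, data = toy)),
  broken  = fit_or_skip("Broken", function() stop("fit failed"))
)

# Keep only the models that were fitted successfully
fits <- Filter(Negate(is.null), fits)
print(names(fits))
```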
Generalized Linear Model fit and parameters using a Negative Binomial model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -105.924
## 2 10 -88.581 8 34.685 3.052e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 2 | data$site_nameAntulang | 1.4087672 | 0.032 |
| 8 | data$site_nameLutoban South | -1.7047481 | 0.074 |
| 3 | data$site_nameBasak | 1.1284653 | 0.088 |
| 1 | (Intercept) | 0.6061358 | 0.225 |
| 7 | data$site_nameLutoban Pier | 0.7801586 | 0.245 |
| 5 | data$site_nameGuinsuan | 0.3101549 | 0.653 |
| 4 | data$site_nameDauin Poblacion District 1 | -0.2006707 | 0.781 |
| 6 | data$site_nameKookoo’s Nest | -19.9087209 | 0.996 |
| 9 | data$site_nameMalatapay Pier | -19.9087209 | 0.996 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, which makes no assumptions about the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 28.65, df = 8, p-value = 0.0003651
Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj < 0.05) results.
| site | difference | p_adj |
|---|---|---|
| Kookoo’s Nest-Antulang | 4 | 0.007 |
| Lutoban South-Antulang | 3 | 0.024 |
| Malatapay Pier-Antulang | 4 | 0.007 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -37.096
## 2 9 -19.907 8 34.377 3.47e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-values for model parameters
| | Pr(>\|z\|) |
|---|---|
| (Intercept) | 1.000 |
| Antulang | 0.996 |
| Basak | 0.239 |
| Dauin Poblacion District 1 | 0.239 |
| Guinsuan | 0.560 |
| Kookoo’s Nest | 0.996 |
| Lutoban Pier | 0.560 |
| Lutoban South | 0.239 |
| Malatapay Pier | 0.996 |
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.
Normal Q-Q plot for the model residuals.
Hierarchical cluster analysis investigating species grouping across samples based on their presence/absence. Dissimilarities are calculated using the Jaccard index: for two species i and j, the number of samples that contain both species is divided by the number of samples that contain at least one of them, and the result is subtracted from 1. Based on these dissimilarities, an average-linkage hierarchical clustering is done to group the species.
## Creating a temporary cluster...done:
## socket cluster with 3 nodes on host 'localhost'
## Multiscale bootstrap... Done.
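As a small illustration of the distance and clustering steps described above, the sketch below uses base R's `dist(..., method = "binary")`, which gives the Jaccard dissimilarity for presence/absence data, followed by average-linkage `hclust`. The species names and the presence/absence matrix are invented, and the bootstrap step performed by pvclust is not reproduced here.

```r
# Hypothetical presence/absence matrix: rows are species, columns are samples
pa <- rbind(
  species_a = c(1, 1, 0, 0, 1),
  species_b = c(1, 0, 0, 1, 1),
  species_c = c(0, 0, 1, 0, 0)
)

# Jaccard dissimilarity by hand for species a and b
both   <- sum(pa["species_a", ] == 1 & pa["species_b", ] == 1)
either <- sum(pa["species_a", ] == 1 | pa["species_b", ] == 1)
jaccard_ab <- 1 - both / either

# Same quantity for all pairs, then average-linkage clustering
d  <- dist(pa, method = "binary")
hc <- hclust(d, method = "average")

print(jaccard_ab)
print(d)
```

Species that never share a sample get the maximum dissimilarity of 1, so they are joined last in the dendrogram.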
Clustering diagram
Table of the species clusters
| group | species |
|---|---|
| 1 | wetmorella_albofasciata_pres |
| 1 | anampses_geographicus_pres |
| 2 | coris_batuensis_pres |
| 2 | cirrhilabrus_cyanopleura_pres |
| 2 | cirrhilabrus_cyanopleura_juv_pres |
| 2 | halichoeres_prosopeion_pres |
| 2 | halichoeres_hortulanus_pres |
| 2 | oxycheilinus_digrammus_pres |
| 2 | thalassoma_lunare_pres |
| 2 | stethojulis_interrupta_pres |
| 2 | pseudocheilinus_evanidus_pres |
| 2 | bodianus_mesothorax_pres |
| 2 | labrioides_dimidiatus_pres |
| 2 | labrioides_dimidiatus_juv_pres |
| 2 | parupeneus_multifasciatus_pres |
| 2 | parupeneus_multifasciatus_juv_pres |
| 2 | parupeneus_barberinus_pres |
| 2 | parupeneus_barberinus_juv_pres |
| 2 | thalassoma_lunare_juv_pres |
| 3 | halichoeres_zeylonicus_pres |
| 3 | coris_gaimard_pres |
| 4 | cheilinus_chlorourus_pres |
| 4 | oxycheilinus_celebicus_pres |
| 5 | parupeneus_crassilabris_juv_pres |
| 5 | stethojulis_trilineata_pres |
| 6 | gomphosus_varius_pres |
| 6 | thalassoma_hardwicke_pres |
| 6 | gomphosus_varius_juv_pres |
| 7 | hemigymnus_melapterus_pres |
| 7 | halichoeres_melanochir_pres |
| 8 | cheilio_inermis_pres |
| 8 | parupeneus_barberinoides_juv_pres |
| 9 | bodianus_mesothorax_juv_pres |
| 9 | pseudodax_mollocanus_pres |
| 10 | halichoeres_richmondi_pres |
| 10 | cheilinus_fasciatus_pres |
| 10 | labrichthys_unileatus_juv_pres |
| 11 | anampses_melanurus_pres |
| 11 | anampses_melanurus_juv_pres |
| 12 | halichoeres_scapularis_juv_pres |
| 12 | hologymnosus_annulatus_pres |
| 13 | hemigymnus_melapterus_juv_pres |
| 13 | labropsis_alleni_pres |
| 14 | mulloidichthys_vanicolensis_pres |
| 14 | pseudocoris_bleekeri_pres |
| 15 | coris_batuensis_juv_pres |
| 15 | stethojulis_trilineata_juv_pres |
| 16 | halichoeres_podostigma_juv_pres |
| 16 | anampses_twistii_pres |
Redfin Hogfish
Variance/Mean Ratio: 3.6
Observed frequencies of the total score vs expected frequencies of different distributions.
Generalized Linear Model fit and parameters using a Zero-Inflated Poisson model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -158.26
## 2 18 -122.18 16 72.167 4.152e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | group | coefficient | value | pvalue |
|---|---|---|---|---|
| 1 | count | (Intercept) | 2.2335922 | 0.000 |
| 3 | count | data$site_nameBasak | 0.3813677 | 0.034 |
| 6 | count | data$site_nameKookoo’s Nest | -0.4700032 | 0.036 |
| 4 | count | data$site_nameDauin Poblacion District 1 | -0.4788207 | 0.045 |
| 9 | count | data$site_nameMalatapay Pier | -0.8673259 | 0.116 |
| 5 | count | data$site_nameGuinsuan | 0.2787133 | 0.124 |
| 7 | count | data$site_nameLutoban Pier | -0.2886007 | 0.342 |
| 8 | count | data$site_nameLutoban South | -0.1541504 | 0.438 |
| 2 | count | data$site_nameAntulang | -0.1133289 | 0.564 |
| 10 | zero | (Intercept) | -20.5770960 | 0.999 |
| 13 | zero | data$site_nameDauin Poblacion District 1 | 18.9489987 | 0.999 |
| 16 | zero | data$site_nameLutoban Pier | 21.2688750 | 0.999 |
| 18 | zero | data$site_nameMalatapay Pier | 22.1624518 | 0.999 |
| 11 | zero | data$site_nameAntulang | -0.0000001 | 1.000 |
| 12 | zero | data$site_nameBasak | -0.0000001 | 1.000 |
| 14 | zero | data$site_nameGuinsuan | -0.0000001 | 1.000 |
| 15 | zero | data$site_nameKookoo’s Nest | -0.0000001 | 1.000 |
| 17 | zero | data$site_nameLutoban South | -0.0000001 | 1.000 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, which makes no assumptions about the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 33.956, df = 8, p-value = 4.137e-05
Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj < 0.05) results.
| site | difference | p_adj |
|---|---|---|
| Lutoban Pier-Basak | 4 | 0.005 |
| Lutoban Pier-Guinsuan | 3 | 0.027 |
| Malatapay Pier-Basak | 4 | 0.000 |
| Malatapay Pier-Guinsuan | 4 | 0.003 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -25.8749
## 2 9 -9.2258 8 33.298 5.441e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-values for model parameters
| | Pr(>\|z\|) |
|---|---|
| (Intercept) | 0.998 |
| Antulang | 1.000 |
| Basak | 1.000 |
| Dauin Poblacion District 1 | 0.998 |
| Guinsuan | 1.000 |
| Kookoo’s Nest | 1.000 |
| Lutoban Pier | 0.998 |
| Lutoban South | 1.000 |
| Malatapay Pier | 0.998 |
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.
Normal Q-Q plot for the model residuals.
Leopard Wrasse
Variance/Mean Ratio: 4.7
Observed frequencies of the total score vs expected frequencies of different distributions.
## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"
Generalized Linear Model fit and parameters using a Negative Binomial model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -118.17
## 2 10 -104.88 8 26.584 0.000834 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 1.8458267 | 0.000 |
| 7 | data$site_nameLutoban Pier | -2.2512918 | 0.003 |
| 5 | data$site_nameGuinsuan | -1.4403616 | 0.033 |
| 9 | data$site_nameMalatapay Pier | -1.2396909 | 0.060 |
| 8 | data$site_nameLutoban South | -0.9985288 | 0.121 |
| 4 | data$site_nameDauin Poblacion District 1 | -0.8043728 | 0.205 |
| 2 | data$site_nameAntulang | -0.4595323 | 0.459 |
| 3 | data$site_nameBasak | 0.2744368 | 0.649 |
| 6 | data$site_nameKookoo’s Nest | -21.1484118 | 0.996 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, which makes no assumptions about the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 23.091, df = 8, p-value = 0.00325
Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj < 0.05) results.
| site | difference | p_adj |
|---|---|---|
| Kookoo’s Nest-Andulay | 3 | 0.036 |
| Kookoo’s Nest-Basak | 4 | 0.016 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -37.282
## 2 9 -23.386 8 27.79 0.0005158 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-values for model parameters
| | Pr(>\|z\|) |
|---|---|
| (Intercept) | 0.994 |
| Antulang | 0.995 |
| Basak | 0.995 |
| Dauin Poblacion District 1 | 0.995 |
| Guinsuan | 0.994 |
| Kookoo’s Nest | 0.992 |
| Lutoban Pier | 0.994 |
| Lutoban South | 0.995 |
| Malatapay Pier | 0.994 |
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.
Normal Q-Q plot for the model residuals.
Linedcheeked Wrasse
Variance/Mean Ratio: 2.7
Observed frequencies of the total score vs expected frequencies of different distributions.
General Linear Model fit and parameters using Gaussian model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -162.78
## 2 10 -138.75 8 48.056 9.641e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 12.3333333 | 0.000 |
| 5 | data$site_nameGuinsuan | -9.6666667 | 0.000 |
| 9 | data$site_nameMalatapay Pier | -8.1666667 | 0.000 |
| 3 | data$site_nameBasak | -6.6666667 | 0.001 |
| 6 | data$site_nameKookoo’s Nest | -2.0000000 | 0.317 |
| 4 | data$site_nameDauin Poblacion District 1 | -1.5000000 | 0.453 |
| 2 | data$site_nameAntulang | 1.3333333 | 0.505 |
| 7 | data$site_nameLutoban Pier | -0.5000000 | 0.802 |
| 8 | data$site_nameLutoban South | -0.3333333 | 0.868 |
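With a Gaussian model the coefficients are on the score scale itself, so fitted site means follow by simple addition. The sketch below assumes treatment contrasts with Andulay (the site absent from the table) as the reference level, which the output does not state explicitly.

```r
# Estimates copied from the table above
intercept <- 12.3333333   # mean score at the reference site (assumed to be Andulay)
effects <- c(Guinsuan = -9.6666667, `Malatapay Pier` = -8.1666667, Antulang = 1.3333333)

# Fitted site means: intercept plus the site coefficient
site_means <- intercept + effects
print(round(site_means, 2))
```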
Plots for evaluation of model conditions.
Kruskal-Wallis test, which makes no assumptions about the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 28.113, df = 8, p-value = 0.0004533
Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj < 0.05) results.
| site | difference | p_adj |
|---|---|---|
| Guinsuan-Antulang | 4 | 0.005 |
| Malatapay Pier-Antulang | 3 | 0.045 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -18.837
## 2 9 -10.681 8 16.311 0.03814 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-values for model parameters
| | Pr(>\|z\|) |
|---|---|
| (Intercept) | 0.998 |
| Antulang | 1.000 |
| Basak | 0.998 |
| Dauin Poblacion District 1 | 1.000 |
| Guinsuan | 0.998 |
| Kookoo’s Nest | 1.000 |
| Lutoban Pier | 1.000 |
| Lutoban South | 1.000 |
| Malatapay Pier | 0.998 |
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.
Normal Q-Q plot for the model residuals.
Cutribbon Wrasse
Variance/Mean Ratio: 2.9
Observed frequencies of the total score vs expected frequencies of different distributions.
Generalized Linear Model fit and parameters using a Negative Binomial model
## Likelihood ratio test
##
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -161.05
## 2 10 -140.32 8 41.448 1.719e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
| | coefficient | value | pvalue |
|---|---|---|---|
| 1 | (Intercept) | 2.3353749 | 0.000 |
| 6 | data$site_nameKookoo’s Nest | -1.2367626 | 0.000 |
| 7 | data$site_nameLutoban Pier | -2.1812242 | 0.000 |
| 4 | data$site_nameDauin Poblacion District 1 | -0.3894648 | 0.142 |
| 3 | data$site_nameBasak | -0.2353141 | 0.363 |
| 2 | data$site_nameAntulang | -0.1381503 | 0.588 |
| 5 | data$site_nameGuinsuan | -0.1017827 | 0.688 |
| 8 | data$site_nameLutoban South | -0.0666914 | 0.792 |
| 9 | data$site_nameMalatapay Pier | 0.0000000 | 1.000 |
Plots for evaluation of model conditions.
Kruskal-Wallis test, which makes no assumptions about the data distribution.
##
## Kruskal-Wallis rank sum test
##
## data: data$y by data$site_name
## Kruskal-Wallis chi-squared = 23.601, df = 8, p-value = 0.002672
Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj < 0.05) results.
| site | difference | p_adj |
|---|---|---|
| Lutoban Pier-Andulay | 3 | 0.042 |
| Lutoban South-Lutoban Pier | 3 | 0.048 |
| Malatapay Pier-Lutoban Pier | 3 | 0.034 |
Logistic regression testing the chance the species is encountered at each site.
## Likelihood ratio test
##
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 1 -14.2588
## 2 9 -9.2258 8 10.066 0.2604
## [1] "Unkown indicator species:"
## [1] "cheilodipterus_quinquelineatus" "fistularia_commersonii"
## [3] "myripristis_botche" "naso_unicornis"
## [5] "platax_pinnatus" "pterois_volitans"
## [7] "cheilinus_undulatus" "labrichthys_unileatus"
| | family | site | div |
|---|---|---|---|
| 815 | Indicator Species | Guinsuan | 3.793934 |
| 818 | Indicator Species | Lutoban South | 3.884768 |
| 819 | Indicator Species | Malatapay Pier | 3.887487 |
| 817 | Indicator Species | Lutoban Pier | 3.912158 |
| 816 | Indicator Species | Kookoo’s Nest | 3.952451 |
| 813 | Indicator Species | Basak | 3.972587 |
| 811 | Indicator Species | Andulay | 3.987899 |
| 812 | Indicator Species | Antulang | 4.004278 |
| 814 | Indicator Species | Dauin Poblacion District 1 | 4.095858 |
| 826 | Total | Lutoban Pier | 4.888411 |
| 827 | Total | Lutoban South | 4.894894 |
| 824 | Total | Guinsuan | 4.907122 |
| 825 | Total | Kookoo’s Nest | 4.965014 |
| 828 | Total | Malatapay Pier | 4.973251 |
| 820 | Total | Andulay | 5.042729 |
| 821 | Total | Antulang | 5.111224 |
| 822 | Total | Basak | 5.130003 |
| 823 | Total | Dauin Poblacion District 1 | 5.163217 |
Hill, Josh, and Clive Wilkinson. 2004. “Methods for ecological monitoring of coral reefs.” Australian Institute of Marine Science, Townsville, 117. doi:10.1017/CBO9781107415324.004.
Suzuki, R., and H. Shimodaira. 2015. “Package ‘pvclust’.” R Topics Documented, 14. http://www.sigmath.es.osaka-u.ac.jp/shimo-lab/prog/pvclust/.